greedy coordinate descent
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Health & Medicine > Therapeutic Area (0.47)
- Health & Medicine > Diagnostic Medicine (0.46)
Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex
A greedy algorithm called Soft ThreshOlding PrOjection onto simplex (SOTOPO) is proposed to exactly solve the associated subproblem. In order to improve the convergence rate and reduce the iteration cost further, two important strategies are used in first-order methods: Nesterov's acceleration and stochastic optimization. Nesterov's acceleration refers to techniques that use algebraic tricks to accelerate first-order algorithms, while stochastic optimization refers to methods that sample one training example at a time rather than computing the full gradient.
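To make the two strategies named above concrete, here is a minimal Python sketch (not the paper's SOTOPO algorithm) of a Nesterov-accelerated stochastic gradient step on an assumed least-squares objective; the objective, step size, and momentum schedule are illustrative choices only.

```python
import numpy as np

def nesterov_stochastic_step(A, b, x, y, t, lr=0.01, rng=None):
    """One Nesterov-accelerated stochastic step on 0.5*||Ax - b||^2.

    Samples a single training row (stochastic optimization) and takes the
    gradient step at the extrapolated point y (Nesterov's acceleration).
    """
    rng = np.random.default_rng() if rng is None else rng
    i = rng.integers(A.shape[0])                     # sample one training example
    grad_i = (A[i] @ y - b[i]) * A[i]                # stochastic gradient at y
    x_new = y - lr * grad_i                          # gradient step
    y_new = x_new + (t / (t + 3.0)) * (x_new - x)    # momentum extrapolation
    return x_new, y_new

# toy usage on a synthetic noiseless regression problem
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
x_true = rng.normal(size=10)
b = A @ x_true
x = y = np.zeros(10)
for t in range(2000):
    x, y = nesterov_stochastic_step(A, b, x, y, t, rng=rng)
print("distance to generating solution:", float(np.linalg.norm(x - x_true)))
```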
- Asia > China (0.24)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Coordinate-wise Power Method
Qi Lei, Kai Zhong, Inderjit S. Dhillon
In this paper, we propose a coordinate-wise version of the power method from an optimization viewpoint. The vanilla power method simultaneously updates all the coordinates of the iterate, which is essential for its convergence analysis. However, different coordinates converge to the optimal value at different speeds.
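For contrast with the coordinate-wise variant, a minimal sketch of the vanilla power method, in which every coordinate of the iterate is updated simultaneously (the symmetric test matrix is an arbitrary example, not from the paper):

```python
import numpy as np

def power_method(A, num_iters=100, rng=None):
    """Vanilla power method: every coordinate of x is updated at once."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.normal(size=A.shape[0])
    for _ in range(num_iters):
        x = A @ x                  # all coordinates updated simultaneously
        x /= np.linalg.norm(x)     # renormalize to avoid overflow/underflow
    return x                       # approximates the dominant eigenvector

A = np.array([[4.0, 1.0], [1.0, 3.0]])
v = power_method(A)
print(v, float(v @ A @ v))  # Rayleigh quotient approximates the top eigenvalue
```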
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
- (2 more...)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- (4 more...)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Health & Medicine > Therapeutic Area > Neurology (0.83)
- Health & Medicine > Health Care Technology (0.64)
Reviews: Multivariate Convolutional Sparse Coding for Electromagnetic Brain Signals
This work extends convolutional sparse coding to the multivariate case with a focus on multichannel EEG decomposition. This corresponds to a non-convex minimization problem, and a local minimum is found via alternating optimization. Reasonably efficient bookkeeping (precomputation of certain factors, and windowing for locally greedy coordinate descent) is used to improve scalability. The locally greedy coordinate descent cycles through time windows, but performs a greedy coordinate descent within each window. As spatial patterns are essential for understanding EEG, this multivariate extension is an important contribution.
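As a rough sketch of the locally greedy idea (applied here to a plain lasso problem rather than the paper's multivariate convolutional model; the window size, penalty, and unit-norm dictionary are illustrative assumptions):

```python
import numpy as np

def soft_threshold(v, lam):
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def locally_greedy_cd(D, y, lam=0.1, window=10, sweeps=50):
    """Locally greedy coordinate descent for the lasso
       0.5*||y - D z||^2 + lam*||z||_1, assuming unit-norm columns of D.

    Coordinates are grouped into contiguous windows; within each window the
    coordinate with the largest candidate change is updated (greedy), and the
    windows themselves are visited cyclically (local).
    """
    n = D.shape[1]
    z = np.zeros(n)
    residual = y - D @ z
    for _ in range(sweeps):
        for start in range(0, n, window):                # cycle over windows
            idx = np.arange(start, min(start + window, n))
            # closed-form coordinate minimizers within the window
            corr = D[:, idx].T @ residual + z[idx]
            z_new = soft_threshold(corr, lam)
            j_local = np.argmax(np.abs(z_new - z[idx]))  # greedy pick in window
            j = idx[j_local]
            delta = z_new[j_local] - z[j]
            if delta != 0.0:
                residual -= delta * D[:, j]              # keep residual in sync
                z[j] += delta
    return z

# toy usage: recover a sparse code from a noiseless observation
rng = np.random.default_rng(0)
D = rng.normal(size=(100, 40)); D /= np.linalg.norm(D, axis=0)
z_true = np.zeros(40); z_true[[3, 17]] = [1.5, -2.0]
y = D @ z_true
print(np.nonzero(locally_greedy_cd(D, y, lam=0.05))[0])  # recovered support
```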
- Health & Medicine > Therapeutic Area > Neurology (0.52)
- Health & Medicine > Health Care Technology (0.40)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China (0.04)
Nearest Neighbor based Greedy Coordinate Descent
Modern statistical estimators developed over the past decade have statistical or sample complexity that depends only weakly on the number of parameters when there is some structure to the problem, such as sparsity. A central question is whether similar advances can be made in their computational complexity as well. In this paper, we propose strategies that indicate that such advances can indeed be made. In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse. We then propose a suite of methods that perform these greedy steps efficiently by a reduction to nearest neighbor search. We also devise a more amenable form of greedy descent for composite non-smooth objectives, as well as several approximate variants of such greedy descent. We develop a practical implementation of our algorithm that combines greedy coordinate descent with locality sensitive hashing. Without tuning the latter data structure, we are not only able to significantly speed up the vanilla greedy method, but also outperform cyclic descent when the problem size becomes large. Our results indicate the effectiveness of our nearest neighbor strategies, and also point to many open questions regarding the development of computational geometric techniques tailored towards first-order optimization methods.
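The reduction described above can be sketched as follows: for a least-squares objective with unit-norm columns, the greedy coordinate choice is a maximum inner product query between the residual and the columns. The brute-force scan below stands in for the (approximate) nearest-neighbor or LSH index; it is a hedged illustration, not the paper's implementation.

```python
import numpy as np

def greedy_cd_least_squares(A, b, iters=200):
    """Greedy coordinate descent for 0.5*||Ax - b||^2, assuming unit-norm columns.

    The greedy step picks argmax_j |a_j . residual|, a maximum inner product
    search between the residual and the columns of A; here it is done by brute
    force, which is exactly the query one would hand to an approximate
    nearest-neighbor / LSH structure to avoid the full scan.
    """
    x = np.zeros(A.shape[1])
    residual = b - A @ x
    for _ in range(iters):
        scores = A.T @ residual             # inner products with every column
        j = int(np.argmax(np.abs(scores)))  # greedy coordinate choice
        step = scores[j]                    # exact minimizer along coordinate j
        x[j] += step
        residual -= step * A[:, j]
    return x

# toy usage on a noiseless sparse regression problem
rng = np.random.default_rng(0)
A = rng.normal(size=(80, 30)); A /= np.linalg.norm(A, axis=0)
x_true = np.zeros(30); x_true[[2, 9, 25]] = [1.0, -0.5, 2.0]
b = A @ x_true
print("distance to generating solution:",
      float(np.linalg.norm(greedy_cd_least_squares(A, b) - x_true)))
```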
- North America > United States > Texas > Travis County > Austin (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Coordinate-wise Power Method
Qi Lei
In this paper, we propose a coordinate-wise version of the power method from an optimization viewpoint. The vanilla power method simultaneously updates all the coordinates of the iterate, which is essential for its convergence analysis. However, different coordinates converge to the optimal value at different speeds. Our proposed algorithm, which we call coordinate-wise power method, is able to select and update the most important k coordinates in O(kn) time at each iteration, where n is the dimension of the matrix and k ≤ n is the size of the active set. Inspired by the "greedy" nature of our method, we further propose a greedy coordinate descent algorithm applied on a non-convex objective function specialized for symmetric matrices. We provide convergence analyses for both methods. Experimental results on both synthetic and real data show that our methods achieve up to 23 times speedup over the basic power method. Meanwhile, due to their coordinate-wise nature, our methods are very suitable for the important case when data cannot fit into memory. Finally, we describe how the coordinate-wise mechanism can be applied to other iterative methods that are used in machine learning.
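A simplified, hedged sketch of the coordinate-wise idea follows. It is not the paper's exact procedure: the matrix-vector product is recomputed in full each iteration for clarity, whereas the O(kn) per-iteration cost relies on maintaining that product incrementally.

```python
import numpy as np

def coordinate_wise_power_method(A, k=3, iters=1000, rng=None):
    """Simplified sketch: update only the k coordinates that would change most.

    Note: A @ x is recomputed in full here for readability; the O(kn)
    per-iteration cost in the paper comes from updating it incrementally.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    x = rng.normal(size=n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        w = A @ x
        target = w / np.linalg.norm(w)              # full power-method update
        idx = np.argsort(np.abs(target - x))[-k:]   # k most-changed coordinates
        x[idx] = target[idx]                        # update only those
        x /= np.linalg.norm(x)                      # keep the iterate normalized
    return x

# toy usage on a random positive semidefinite matrix
rng = np.random.default_rng(0)
M = rng.normal(size=(8, 8))
A = M @ M.T
v = coordinate_wise_power_method(A, k=3, rng=rng)
print("Rayleigh quotient:", float(v @ A @ v))
print("largest eigenvalue:", float(np.max(np.linalg.eigvalsh(A))))
```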
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (2 more...)
Asynchronous Parallel Greedy Coordinate Descent
Yang You, Ji Liu, et al.
In this paper, we propose and study an Asynchronous parallel Greedy Coordinate Descent (Asy-GCD) algorithm for minimizing a smooth function with bounded constraints. At each iteration, workers asynchronously conduct greedy coordinate descent updates on a block of variables. In the first part of the paper, we analyze the theoretical behavior of Asy-GCD and prove a linear convergence rate. In the second part, we develop an efficient kernel SVM solver based on Asy-GCD in the shared memory multi-core setting. Since our algorithm is fully asynchronous--each core does not need to idle and wait for the other cores--the resulting algorithm enjoys good speedup and outperforms existing multi-core kernel SVM solvers including asynchronous stochastic coordinate descent and multi-core LIBSVM.
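A rough sketch of the asynchronous greedy pattern on an assumed quadratic objective, with each thread performing lock-free greedy coordinate updates on its own block of variables; this illustrates the update scheme only, not the paper's kernel SVM solver.

```python
import threading
import numpy as np

def asy_gcd(Q, b, blocks, iters_per_worker=2000):
    """Sketch of asynchronous parallel greedy coordinate descent on
    f(x) = 0.5*x^T Q x - b^T x  (Q symmetric positive definite).

    Each worker owns a block of coordinates and repeatedly performs a greedy
    coordinate update within its block, reading the shared iterate without
    locks (so reads may be slightly stale), in the spirit of the asynchronous
    setting described above.
    """
    x = np.zeros(len(b))  # shared iterate, updated in place by all workers

    def worker(block):
        block = np.asarray(block)
        for _ in range(iters_per_worker):
            grad_block = Q[block] @ x - b[block]            # possibly stale read
            j = block[int(np.argmax(np.abs(grad_block)))]   # greedy pick in block
            x[j] -= (Q[j] @ x - b[j]) / Q[j, j]             # exact coordinate step

    threads = [threading.Thread(target=worker, args=(blk,)) for blk in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x

# toy usage: two workers, each owning half of the coordinates
rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20))
Q = M @ M.T + 20 * np.eye(20)            # well-conditioned SPD matrix
b = rng.normal(size=20)
blocks = [range(0, 10), range(10, 20)]
x = asy_gcd(Q, b, blocks)
print("gradient norm at solution:", float(np.linalg.norm(Q @ x - b)))
```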
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- (4 more...)